Characterizing Conical Intersections in DNA/RNA Nucleobases with Multiconfigurational Wave Functions of Varying Active Space Size

We characterize the photochemically relevant conical intersections between the lowest-lying accessible electronic excited states of the different DNA/RNA nucleobases using Cholesky decomposition-based complete active space self-consistent field (CASSCF) algorithms. We benchmark two different basis set contractions and several active spaces for each nucleobase and conical intersection type, measuring for the first time how active space size affects conical intersection topographies in these systems and the potential implications these may have toward their description of photoinduced phenomena. Our results show that conical intersection topographies are highly sensitive to the electron correlation included in the model: by changing the amount (and type) of correlated orbitals, conical intersection topographies vastly change, and the changes observed do not follow any converging pattern toward the topographies obtained with the largest and most correlated active spaces. Comparison across systems shows analogous topographies for almost all intersections mediating population transfer to the dark 1nO/Nπ* states, while no similarities are observed for the “ethylene-like” conical intersection ascribed to mediate the ultrafast decay component to the ground state in all DNA/RNA nucleobases. Basis set size seems to have a minor effect, appearing to be relevant only for purine-based derivatives. We rule out structural changes as a key factor in classifying the different conical intersections, which display almost identical geometries across active space and basis set change, and we highlight instead the importance of correctly describing the electronic states involved at these crossing points. Our work shows that careful active space selection is essential to accurately describe conical intersection topographies and therefore to adequately account for their active role in molecular photochemistry.


Cartesian coordinates
Cartesian coordinates for all optimised minimum energy conical intersections can be accessed through the following DOI/Zenodo repository: 10.5281/zenodo.8348402. The repository contains a zip folder for each conical intersection. Each zip folder contains two subfolders, labeled DZ or TZ according to the basis set used in the optimization, which hold the Cartesian coordinates for each geometry, with the following naming convention:

Molecular orbitals and active spaces
Molecular orbitals (MOs) considered for each nucleobase are shown in Figures S1-S5. Based on this information, the different active spaces used in the optimization of each conical intersection, together with the natural orbital occupation numbers (NOONs), can be found in Tables S1-S36, where the orbital notation corresponds to that used in the figures. The tables reporting NOONs contain, for each active space, two data sets, one for each of the states involved in the conical intersection.
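The active spaces in this section are labelled by the number of active electrons and active orbitals, e.g. (14,10) for 14 electrons distributed in 10 orbitals. As a rough illustration of how quickly the size of the CASSCF expansion grows with the active space, the sketch below evaluates Weyl's dimension formula for the number of spin-adapted configuration state functions (CSFs). This is a standard textbook count and is not a quantity reported in the tables; the function name `n_csf` is an illustrative choice, singlet spin is assumed by default, and point-group symmetry is ignored.

```python
from math import comb


def n_csf(n_electrons: int, n_orbitals: int, two_s: int = 0) -> int:
    """Number of CSFs in a CAS(n_electrons, n_orbitals) space (Weyl's formula).

    two_s is twice the total spin S (0 for singlets, 2 for triplets).
    Spatial symmetry is ignored, so this is an upper bound per irrep.
    """
    if not 0 <= n_electrons <= 2 * n_orbitals:
        raise ValueError("electron count does not fit in the active orbitals")
    if two_s < 0 or two_s > n_electrons or (n_electrons - two_s) % 2 != 0:
        raise ValueError("spin is incompatible with the electron count")
    lower = (n_electrons - two_s) // 2        # N/2 - S
    upper = (n_electrons + two_s) // 2 + 1    # N/2 + S + 1
    numerator = (two_s + 1) * comb(n_orbitals + 1, lower) * comb(n_orbitals + 1, upper)
    return numerator // (n_orbitals + 1)


# Largest active spaces quoted in the text for the pyrimidines, guanine and adenine,
# plus the smallest (2,2) space for comparison.
for n_el, n_orb in [(2, 2), (14, 10), (20, 14), (18, 13)]:
    print(f"({n_el},{n_orb}): {n_csf(n_el, n_orb)} singlet CSFs")
# (2,2): 3 singlet CSFs
# (14,10): 4950 singlet CSFs
# (20,14): 273273 singlet CSFs
# (18,13): 143143 singlet CSFs
```

The steep growth between (14,10) and (20,14) is consistent with the remark that the largest guanine active space could only be afforded for a single conical intersection.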

Cytosine
MOs included in the cytosine calculations with an active space of (14,10) can be seen in Figure S1. For the other active spaces, and for each of the conical intersections, separate tables (S1, S3, S5, S7, S9) list which MOs were included in the calculation, while the remaining tables report how the occupation of those orbitals changes in the different optimizations.
Table S1: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1ππ*/S0)CI for cytosine.
(14,10)
(12,9) X
(10,8) X X
(8,7) X X X
(8,6) X X X X
(8,5) X X X X X
(6,4) X X X X X X
(4,3) X X X X X X X
(2,2) X X X X X X X X
Table S2: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1ππ*/S0)CI of cytosine. Orbitals that participate in the conical intersection under study are marked in green.
Table S3: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/1ππ*)CI for cytosine.
Table S4: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/1ππ*)CI of cytosine. Orbitals that participate in the conical intersection under study are marked in green.
Table S5: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/S0)CI for cytosine.
Table S6: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/S0)CI of cytosine.
Table S7: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nNπ*/S0)CI for cytosine.
Table S8: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nNπ*/S0)CI of cytosine.
Table S9: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nNπ*/1ππ*)CI for cytosine.
Table S10: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nNπ*/1ππ*)CI of cytosine.
The first conical intersection of cytosine is (1ππ*/S0)CI. With the larger active spaces, one electron is promoted from the HOMO (occupation ∼2 in the ground state) to the LUMO, so that the NOONs of both molecular orbitals become ∼1 in the excited state.
As the active space is reduced, the occupations do not change as drastically, giving rise to states in which the electron transfer is less evident. The other four conical intersections studied for cytosine do not appear to follow this trend. In these cases, the orbital occupations are only slightly affected by the reduction of the active space, and in no case do these differences in the NOONs correlate with the results observed in the P vs B plots.
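The comparison described above amounts to checking, orbital by orbital, how the NOONs differ between the two states that meet at the intersection. A minimal sketch of that bookkeeping is given below; the orbital labels, the occupation values and the 0.5 threshold are illustrative assumptions and are not taken from Tables S1-S10.

```python
def noon_changes(state_a, state_b, threshold=0.5):
    """Return {orbital: occ_b - occ_a} for orbitals whose NOON changes by more than threshold."""
    if state_a.keys() != state_b.keys():
        raise ValueError("both states must list the same orbitals")
    return {
        label: round(state_b[label] - state_a[label], 3)
        for label in state_a
        if abs(state_b[label] - state_a[label]) > threshold
    }


# Made-up NOONs mimicking a HOMO -> LUMO single excitation (not data from the tables).
ground = {"HOMO-1": 1.95, "HOMO": 1.90, "LUMO": 0.10, "LUMO+1": 0.05}
excited = {"HOMO-1": 1.93, "HOMO": 1.02, "LUMO": 0.98, "LUMO+1": 0.07}

changes = noon_changes(ground, excited)
print(changes)  # only HOMO and LUMO exceed the threshold
```

With a small active space the same check would typically flag fewer orbitals or smaller differences, which is the qualitative behaviour described for the reduced active spaces.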

Uracil
Similar to the case of cytosine, the orbitals included in the optimizations using an active space of (14,10) are shown in Figure S2. For the remaining active spaces, the orbitals can be found in Tables S11-S16.
Table S11: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1ππ*/S0)CI for uracil.
Table S12: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1ππ*/S0)CI of uracil. Orbitals that participate in the conical intersection under study are marked in green.
Table S13: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/1ππ*)CI for uracil.
Table S14: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/1ππ*)CI of uracil. Orbitals that participate in the conical intersection under study are marked in green.
Table S15: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/S0)CI for uracil.
Table S16: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/S0)CI of uracil. Orbitals that participate in the conical intersection under study are marked in green.
As expected, the NOONs in Tables S12, S14 and S16 do not explain the differences in the classification of the conical intersections of uracil, since the observed changes do not correlate with the P and B results.

Thymine
In this case, the orbitals included in the optimizations using an active space of (14,10) are shown in Figure S3. For the remaining active spaces, the orbitals can be found in Tables S17-S22.
Table S17: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1ππ*/S0)CI for thymine.
Table S18: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1ππ*/S0)CI of thymine. Orbitals that participate in the conical intersection under study are marked in green.
Table S19: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/1ππ*)CI for thymine.
Table S20: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/1ππ*)CI of thymine. Orbitals that participate in the conical intersection under study are marked in green.
Table S21: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (1nOπ*/S0)CI for thymine.
Table S22: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (1nOπ*/S0)CI of thymine. Orbitals that participate in the conical intersection under study are marked in green.
The findings for the conical intersections of thymine mirror the trends observed for uracil and cytosine: the subtle shifts in NOONs do not explain the differences in conical intersection classification, as the changes in orbital occupancies do not correlate with the P and B results.

Guanine
As is well known, the active spaces required for purine nucleobases are larger than those for pyrimidine nucleobases. For guanine, owing to the computational cost of the calculations, an active space of 20 electrons distributed in 14 orbitals was used only for the optimization of the (La(1ππ*)/S0)CI conical intersection. These orbitals are shown in Figure S4. For the other active spaces, and for each conical intersection, separate tables specify which MOs are included and how the NOONs change.
Table S23: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (La(1ππ*)/S0)CI for guanine.
Table S24: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (La(1ππ*)/S0)CI of guanine. Orbitals that participate in the conical intersection under study are marked in green.
Table S25: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (La(1ππ*)/Lb(1ππ*))CI for guanine.
Table S26: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (La(1ππ*)/Lb(1ππ*))CI of guanine. Orbitals that participate in the conical intersection under study are marked in green.
different classifications within the same conical intersection, nor do they give information on how similar those in the same quadrants are.

Adenine
For adenine, the largest active space used was 18 electrons distributed in 13 orbitals, as can be seen in Figure S5. Following the same system as before, we have a table specifying which MOs are included in the different calculations and other tables with information about NOONs.
Table S31: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (La(1ππ*)/S0)CI for adenine.
Table S32: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (La(1ππ*)/S0)CI of adenine. Orbitals that participate in the conical intersection under study are marked in green.
Table S35: Molecular orbitals included in the active spaces used for the optimization of the conical intersection (Lb(1ππ*)/1nNπ*)CI for adenine.
Table S36: Occupation numbers for each one of the molecular orbitals involved in the different optimizations of the conical intersection (Lb(1ππ*)/1nNπ*)CI of adenine. Orbitals that participate in the conical intersection have their cell in green.
Finally, for the adenine (La(1ππ*)/S0)CI conical intersection we found results similar to those for the cytosine (1ππ*/S0)CI (Table S2). With larger active spaces, one electron is promoted from the HOMO (occupation ∼2) to the LUMO, so that the occupancies of both molecular orbitals become ∼1, whereas the situation differs as the active space is reduced. However, these changes do not explain the differences observed in the topography of the different optimized intersections. The same conclusions are drawn for the last two conical intersections studied, as the small changes observed in the occupations are not related to the quadrant changes of the structures.

Figure S1: Valence π and nO/N occupied and π unoccupied molecular orbitals of cytosine, together with their labelling.

Figure S2: Valence π and nO occupied and π unoccupied molecular orbitals of uracil, together with their labelling.

Figure S3: Valence π and nO occupied and π unoccupied molecular orbitals of thymine, together with their labelling.

Figure S4: Valence π and nO/N occupied and π unoccupied molecular orbitals of guanine, together with their labelling. Orbital nN1 is removed from almost all active spaces as its occupation number (and therefore its contribution) is negligible.

Figure S5: Valence π and nO/N occupied and π unoccupied molecular orbitals of adenine, together with their labelling. Orbital nN1 is removed from almost all active spaces as its occupation number (and therefore its contribution) is negligible.
Figure S6: P and B parameters of (1ππ*/S0)CI using multiple different active spaces and the triple-ζ basis set (see Computational Details) for a) cytosine, b) uracil and c) thymine. Active space size is denoted by both marker size and the contour gradient colour provided on the right-hand side of each panel. A picture with the superimposed geometries of all optimised conical intersections is provided as an inset, with the coloured structures representing the outlier intersections marked with a square.
Figure S10: P and B parameters of (La(1ππ*)/S0)CI for a) guanine and b) adenine using multiple different active spaces and the triple-ζ basis set (see Computational Details). Active space size is denoted by both marker size and the contour gradient colour provided on the right-hand side of each panel. A picture with the superimposed geometries of all optimised conical intersections is provided as an inset, with the coloured structures representing the outlier intersections marked with a square.

Figure S12: P and B parameters of (Lb(1ππ*)/1nNπ*)CI (a) for adenine and (La(1ππ*)/1nOπ*)CI (b) and (Lb(1ππ*)/1nNπ*)CI (c) for guanine using multiple different active spaces and the triple-ζ basis set (see Computational Details). Active space size is denoted by both marker size and the contour gradient colour provided on the right-hand side of each panel. A picture with the superimposed geometries of all optimised conical intersections is provided as an inset, with the coloured structures representing the outlier intersection marked with a square.

Figure S14: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-N1-C4-O atoms for the different active spaces used in the optimization of the conical intersection (1ππ*/S0)CI of uracil. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S16: Root Mean Squared Deviation (RMSD) and dihedral angle variation between O-C2-C5-H atoms for the different active spaces used in the optimization of the conical intersection (1nOπ*/1ππ*)CI of cytosine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S17: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-C6-N3-H atoms for the different active spaces used in the optimization of the conical intersection (1nOπ*/1ππ*)CI of uracil. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S18: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-N3-C6-H atoms for the different active spaces used in the optimization of the conical intersection (1nOπ*/1ππ*)CI of thymine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S19: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-N1-C4-N atoms for the different active spaces used in the optimization of the conical intersection (1nOπ*/S0)CI of cytosine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S20: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-N3-C6-H atoms for the different active spaces used in the optimization of the conical intersection (1nOπ*/S0)CI of uracil. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S22: Root Mean Squared Deviation (RMSD) and dihedral angle variation between O-C2-C5-H atoms for the different active spaces used in the optimization of the conical intersection (1nNπ*/S0)CI of cytosine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S23: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-N1-C4-N atoms for the different active spaces used in the optimization of the conical intersection (1nNπ*/1ππ*)CI of cytosine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S26: Root Mean Squared Deviation (RMSD) and dihedral angle variation between N1-C2-N-H atoms for the different active spaces used in the optimization of the conical intersection (La(1ππ*)/Lb(1ππ*))CI of guanine. Yellow line highlights atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S27: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-C2-C8-H atoms for the different active spaces used in the optimization of the conical intersection (La(1ππ*)/Lb(1ππ*))CI of adenine. Yellow symbols highlight atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S28: Root Mean Squared Deviation (RMSD) and dihedral angle variation between C5-C6-N-H atoms for the different active spaces used in the optimization of the conical intersection (Lb(1ππ*)/1nNπ*)CI of adenine. Yellow line highlights atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S29: Root Mean Squared Deviation (RMSD) and dihedral angle variation between O-C6-N1-H atoms for the different active spaces used in the optimization of the conical intersection (La(1ππ*)/1nOπ*)CI of guanine. Yellow line highlights atoms in the dihedral angle in the superimposed geometries (Inset).

Figure S30: Root Mean Squared Deviation (RMSD) and dihedral angle variation between H-C8-N9-H atoms for the different active spaces used in the optimization of the conical intersection (Lb(1ππ*)/1nNπ*)CI of guanine. Yellow line highlights atoms in the dihedral angle in the superimposed geometries (Inset).
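
The two structural descriptors used in the RMSD figures above, the RMSD between optimised geometries and a selected dihedral angle, can be computed directly from Cartesian coordinates. The sketch below is one possible way to do so and rests on several assumptions: the two geometries are already superimposed and share the same atom ordering, the RMSD is not mass-weighted, and the function names are illustrative. The source does not specify the alignment or weighting actually used.

```python
import math


def rmsd(coords_a, coords_b):
    """Unweighted RMSD between two pre-aligned lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("geometries must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four atoms p0-p1-p2-p3."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Toy coordinates (not taken from the repository): a 90 degree twist about the z axis.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the atoms named in each caption (for example H, N1, C4 and O for Figure S14), `dihedral` gives the angle tracked across active spaces, while `rmsd` compares each optimised geometry with a chosen reference structure.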